Abstract

Typical quantum computing schemes require transformations ('gates') to be 
targeted at specific elements ('qubits'). In many physical systems, direct 
targeting is difficult to achieve; an alternative is to encode local gates into 
globally applied transformations. Here we demonstrate the minimum physical 
requirements for such an approach: a one-dimensional array composed of two 
alternating 'types' of two-state system. Each system need be sensitive only to 
the net state of its nearest neighbors, i.e. the number in state '↑' minus the
number in '↓'. Additionally, we show that all such arrays can perform quite
general parallel operations. A broad range of physical systems and interactions 
are suitable: we highlight two examples. 



Presently there is tremendous interest in the new field of quantum computation. Infor- 
mation is recognized as a physical quantity, with its representation and processing being 
governed by the laws of quantum mechanics. Rather than 'bits', the fundamental units of 
classical information theory, we instead employ 'qubits' which represent a general quantum 
superposition of '0' and '1'. A computation on a device containing N qubits is a sequence of 
unitary transformations within its 2^N-dimensional Hilbert space. Researchers have already
discovered quantum algorithms which exploit state superposition, entanglement and inter-
ference in order to solve certain problems more quickly than any known classical procedure.
Two important cases are those of factoring large numbers, where the quantum device
has an exponential speed advantage, and the task of searching among N elements, where
a classical device requires time of order N but the quantum device requires only √N, or
∛N with a corresponding size cost.

Efforts toward experimental realization of a quantum computer (QC) have focused prin-
cipally on NMR and atomic trap implementations. Numerous recent proposals have
also drawn attention to possible solid state realizations. Typically such proposals
demand manipulation of the Hamiltonian locally, on the scale of the individual component 
qubits. However this is not a fundamental requirement: it can be sufficient to apply only 
global manipulations to which all elements are subjected simultaneously. This would be 
a highly desirable economy for many implementations, because it would lower the number 
of channels by which the computer interacts with its environment, and hence reduce the 
decoherence rate. Moreover, it may enable new implementations where it is difficult or im- 
possible to perform individual addressing (e.g. quantum dot arrays may be driven by EM 
radiation of a wavelength far greater than the dot-dot separation). Lloyd has suggested one 
such model [9,10] based on a one-dimensional cellular automaton (CA). The model consists
of a line of 'cells', where each cell is a quantum system possessing two long-lived internal 
eigenstates. The algorithm is represented by a series of update 'rules' which are applied 
globally to all cells, so that there is no need to address units individually. To realize Lloyd's
CA model one would need to produce three 'types' of cell in the pattern ABC ABC..., and
moreover one must find a means of applying asymmetric rules such as, "all cells of type A
now invert their state if, and only if, the left neighbor is in state 1 and the right is in state
0". Clearly it is important to know if these are the minimum physical requirements. The
present work demonstrates that they can in fact be relaxed significantly, to two cell types 
without the ability to distinguish the left neighbor from the right. These are the minimum 
requirements for any globally-driven system, given that we must have more than one cell
'type'. The simplifications enhance the practicality of the model; neighbor indistin-
guishability is particularly significant in broadening the range of potential implementations,
two of which we later discuss. This paper also provides a mechanism, compatible with any 
CA computer, for performing operations in parallel. We note the implications in terms of 
device size and speed. Parallelism may be essential for quantum error correction schemes to 
function efficiently [12].

Our scheme consists of two 'types' of cell, A and B, alternating along a one-dimensional
array. Each cell has two internal eigenstates |↓⟩ and |↑⟩, and can represent any quantum
superposition of these states. Each qubit of quantum information is represented by four
consecutive cells: the qubit basis state |0⟩ is represented by |↑↑↓↓⟩ whilst the state |1⟩ is
represented by |↓↓↑↑⟩. The basis states of a qubit X can therefore be compactly written
|x̄x̄xx⟩, where x corresponds to ↓ if X = 0 and ↑ if X = 1, with the opposite applying to
x̄. Figure 1(a) shows an array containing three qubits, each pair being separated by spacer
cells in the |↓⟩ state (the minimum acceptable spacing is four cells but eight are used
here for clarity). The array is subject to update rules specified by the notation A_f^U, which
means that each cell of type A is subjected to unitary transform U if, and only if, its 'field' has
value f. When the U is omitted a simple inversion is implied, |↓⟩ ↔ |↑⟩. The 'field' is
defined as the number of nearest neighbors in state |↑⟩ minus the number in state |↓⟩. This
is the proper control variable since in a physical realization the cells will be aware of their
neighbors through the net effect of, for example, their electrostatic fields.
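
The update rule just described can be made concrete with a small classical sketch. It acts on a single basis configuration only (a superposition would evolve linearly, term by term) and it implements only the simple inversion, not a general U. The function names, the choice of type A on even indices and type B on odd indices, and the ASCII rendering ('^' for up, 'v' for down) are illustrative assumptions rather than part of the scheme.

```python
UP, DOWN = 1, 0

def encode(qubits, spacers=4):
    """Build a basis configuration: each qubit occupies four cells,
    0 -> up,up,down,down and 1 -> down,down,up,up, with `spacers`
    down cells before, between and after the qubits."""
    cells = [DOWN] * spacers
    for bit in qubits:
        cells += [DOWN, DOWN, UP, UP] if bit else [UP, UP, DOWN, DOWN]
        cells += [DOWN] * spacers
    return cells

def field(cells, i):
    """Number of nearest neighbours in state UP minus the number in DOWN.
    An end cell has a single neighbour, so its field is +1 or -1."""
    neighbours = [cells[j] for j in (i - 1, i + 1) if 0 <= j < len(cells)]
    return sum(1 if s == UP else -1 for s in neighbours)

def apply_update(cells, cell_type, f):
    """Global update: invert every cell of the given type whose field is f.
    Assumption: type A sits on even indices, type B on odd indices."""
    parity = 0 if cell_type == "A" else 1
    fields = [field(cells, i) for i in range(len(cells))]  # read before flipping
    return [
        (UP if s == DOWN else DOWN) if (i % 2 == parity and fields[i] == f) else s
        for i, s in enumerate(cells)
    ]

def show(cells):
    """ASCII rendering: '^' for up, 'v' for down."""
    return "".join("^" if s == UP else "v" for s in cells)

cells = encode([0])                        # one qubit in state 0
after_b0 = apply_update(cells, "B", 0)     # invert B cells with field 0
after_a0 = apply_update(after_b0, "A", 0)  # then invert A cells with field 0
for row in (cells, after_b0, after_a0):
    print(show(row))
```

In this sketch each update shifts the up-up pair by one cell while preserving its form; the direction of travel depends on which cell type leads the pair.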

In classical computing we have the idea of a universal set of gates, i.e. a set of elementary 
operations (such as AND, OR, NOT) which are sufficient to represent any classical algorithm. 
In quantum computing the same concept applies. We first consider the general 'one-qubit 
gate', i.e. any chosen unitary transform applied to any particular qubit regardless of the 
other qubits in the computer. How can we single out one qubit, given that the array is 
structurally regular and our rules must be sent to all elements globally? One solution
is to introduce a 'control unit' (CU). Our CU is represented by six consecutive cells in the
pattern |↑↑↓↓↑↑⟩, which exists only in one place along the array. In Figs. 1(b) and 1(c) we
utilize the CU by applying updates which move it relative to the qubits, together with an 
update sequence which has a net effect only on the qubit nearest the CU. Clearly, by varying 
the update sequence we could have transformed another qubit, or indeed simply moved the 
CU without altering any of them. Thus we can implement a general one-qubit gate. For 
a universal set we require a two-qubit gate: the 'control-U' is more than sufficient. This 
gate applies the transform U to a certain qubit (referred to as the 'target') if, and only if, 
a second qubit (the 'control') is in state 1. Figure 2 shows the implementation of this gate 
schematically; an explicit depiction analogous to Fig. 1(b) is also available [13]. The 'cost'
of restricting ourselves to global manipulations is now apparent: each qubit requires a total 
of eight physical cells (four for the encoding plus four spacers), and a 1-qubit gate requires 
about ten elementary pulses. These numbers would be somewhat smaller if we permitted 
ourselves more cell types and/or more complex interactions.
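
The claim that the CU can move past a qubit without disturbing it can be checked for basis states with a short sketch. It assumes an unbounded line, type A on even sites and type B on odd sites, and an alternating sequence of field-0 inversions on B then A cells; the site numbers and function names are hypothetical. Only the transparent pass is shown, not the extra updates that apply U mid-way.

```python
def field(ups, i):
    """Field at site i on an unbounded line whose up-cells form the set `ups`."""
    n_up = ((i - 1) in ups) + ((i + 1) in ups)
    return 2 * n_up - 2  # -2, 0 or +2

def update(ups, cell_type, f=0):
    """Invert all cells of one type (A = even sites, B = odd sites; an
    assumption) whose field equals f. Only f in {0, 2} is handled: such
    sites always neighbour an up-cell, so they can be enumerated."""
    if f not in (0, 2):
        raise ValueError("this sketch only enumerates sites with field 0 or 2")
    parity = 0 if cell_type == "A" else 1
    candidates = {u + d for u in ups for d in (-1, 1)}
    flips = {i for i in candidates if i % 2 == parity and field(ups, i) == f}
    return ups ^ flips  # symmetric difference inverts the selected sites

def pass_cu(bit, steps=7):
    """Let the CU and one qubit travel towards each other for `steps` updates."""
    # The qubit occupies sites 11..14: up,up,down,down for 0; down,down,up,up for 1.
    qubit = {13, 14} if bit else {11, 12}
    cu = {18, 19, 22, 23}  # sites 18..23 hold up,up,down,down,up,up
    state = qubit | cu
    for step in range(steps):
        state = update(state, "B" if step % 2 == 0 else "A")
    return state

def render(ups, lo=8, hi=26):
    return "".join("^" if i in ups else "v" for i in range(lo, hi))

for bit in (0, 1):
    print(bit, render(pass_cu(bit)))
```

After seven updates the CU pattern has reformed seven sites to the left and the qubit pattern seven sites to the right, for either basis state, so under these assumptions the two objects pass through one another with their forms intact.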

To input information we may exploit the cells at the ends of the array, for which
the possible values of the 'field' variable are {1, -1} (in contrast to the {-2, 0, 2} values for
all other cells). We can use the updates {A_{-1}, A_1, B_{-1}, B_1} to manipulate the end states,
and the other updates to shift-load those states into the array. The means of output will 
depend on the available measurement techniques. If a cell on one end were associated with 
a measuring device, then one would first swap the qubit to be measured with the qubit 
nearest the end (by a series of three CNOTs, for example), then move it onto the end cell 
by the reverse of the input technique. A superior output procedure would be possible if, for 
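example, the cells of type B had some third state |→⟩ exhibiting rapid spontaneous decay
to the |↓⟩ state. Then we could measure the state of a qubit anywhere along the array using
the 1-qubit gate of Fig. 1(b) and choosing

\[
U = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
\quad \text{in the basis } \{\, |\downarrow\rangle,\ |\uparrow\rangle,\ |\rightarrow\rangle \,\}.
\]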

If the subject qubit was previously in state 1, then the transformation would leave its 
representative cell in the unstable state |→⟩. From there it would decay back to |↓⟩ with an
emission. The presence (or absence) of this emission could be detected and used to infer
the state of the qubit. Repeated application of the transform would produce a stream
of emissions (i.e. fluorescence), thus increasing the detection efficiency. Note that the
existence of such a dissipative, irreversible transition may be essential in order to implement
quantum error correction efficiently. Furthermore, our chosen representation of the qubit
basis states and the CU means that dissipative transitions can be used to prevent these
objects from delocalizing [13].
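
The read-out cycle can be illustrated with a minimal sketch for a single cell in a basis state. It assumes that U is the permutation exchanging the down state with the unstable third state while leaving the up state alone, and that the third state always decays to down with exactly one emission before the next application; the names used are hypothetical.

```python
DOWN, UP, THIRD = "down", "up", "third"
BASIS = [DOWN, UP, THIRD]

# Assumed form of U in the basis (DOWN, UP, THIRD): swap DOWN and THIRD.
U = [[0, 0, 1],
     [0, 1, 0],
     [1, 0, 0]]

def is_unitary(m):
    """For a real matrix, unitarity means the rows are orthonormal."""
    n = len(m)
    return all(
        sum(m[i][k] * m[j][k] for k in range(n)) == (1 if i == j else 0)
        for i in range(n) for j in range(n)
    )

def apply_u(state):
    """Image of a basis state under the permutation matrix U."""
    col = BASIS.index(state)
    row = next(r for r in range(len(U)) if U[r][col] == 1)
    return BASIS[row]

def fluorescence(cell_state, cycles):
    """Count emissions over `cycles` applications of U, with the third
    state decaying to DOWN (one emission) after each application."""
    emissions = 0
    for _ in range(cycles):
        cell_state = apply_u(cell_state)
        if cell_state == THIRD:
            cell_state = DOWN  # spontaneous decay
            emissions += 1
    return emissions

print(is_unitary(U), fluorescence(DOWN, 5), fluorescence(UP, 5))
```

Under these assumptions a cell found in the down state emits once per cycle, whereas a cell in the up state never emits, which is what allows the two cases to be distinguished by repeated application.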



So far we have assumed that there is only one control unit (CU) in the computer. Con- 
sequently we cannot apply a gate at several points simultaneously. We could load an initial 
state containing P CUs distributed along the array (e.g. one every 20 qubits), although we
would then be constrained to apply exactly P identical gates simultaneously at every step, 
and always with the same spatial distribution. Completely general parallelism would allow 
us to apply a different number of simultaneous gates at each step, and at varying points 
along the array. How can this ideal be approached? We cannot directly create/annihilate 
CUs at specific locations because we are constrained to use global updates. Figure 3 depicts 
an alternative solution which is appropriate for any CA-like device. We increase the spacing 
between qubits considerably, and in each space we put a CU and a set of classical bits (using 
the same encoding employed for the qubits). Some of these classical bits encode a label, for 
example we might label each space uniquely using a binary number, and the others form an 
auxiliary 'work-pad'. Together the CU and the classical bits effectively constitute an entire 
computer in the space between each qubit; we will refer to these as 'sub-computers'. Now 
suppose we are 'running' a parallelized quantum algorithm, and the next step calls for a 
specific 1- or 2-qubit gate G to be applied simultaneously at P points along our array of N
qubits. This operation would require P CUs, located at just those points. However we ini- 
tially have one CU in each of the N 'sub-computers'. Therefore we send an update sequence 
which causes a computation simultaneously within each 'sub-computer': the label bits
are the input and the output is a binary variable represented by some transformation ap- 
plied/not applied to the CU. This transformation disables the CU: when we subsequently 
apply updates to move the CUs away from their sub-computers to perform the gate opera- 
tion G on neighboring qubits, this will only occur where the CUs are untransformed. One 
such transformation is shown in Fig. 3(b). Having thus implemented the step required by 
our parallel quantum algorithm, we can now reverse the computation previously applied to 
the 'sub-computers' in order to return them to their initial state. 

There are costs and constraints associated with using this procedure. The size of the array 
must be increased by a factor f due to the inclusion of the 'sub-computers'; unique labeling
would imply f of order ln(N). The time τ associated with the 'sub-computer' computation
must be less than O(N), otherwise it would have been quicker to perform the parallel gates in
series. One could not enable/disable a completely arbitrary sub-set of the N CUs under this
time constraint [16], so our procedure does not efficiently implement a completely general
arrangement of gates. However, there are a great many useful distributions of CUs which
do correspond to sufficiently small τ. The most obvious examples include: all CUs, a given
CU, one in every 2^p CUs, all CUs in some interval. For these and many other patterns, τ
is merely of order ln(N). An obvious variation is to place a 'sub-computer' only every ten
qubits, say. Another is to 'nest' the procedure to provide parallel computation within the
'sub-computers' at a cost of ln(ln(N)). The process performed by the 'sub-computer' could
be generalized to apply a range of transformations to the CU, corresponding to different
subsequent gate operations on the qubits. Most interestingly one could generalize the
classical bits in each 'sub-computer' to qubits, so that the computation determining which
CUs are disabled becomes a quantum process producing CUs in a superposition of the
enabled/disabled states. It is unclear whether this could have significant advantages for
algorithmic efficiency.
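
The selection logic performed by the 'sub-computers' can be sketched classically. Each rule in the sketch inspects only the binary label, which has of order log N bits, and decides whether that CU stays enabled. In the actual scheme this test would have to be carried out reversibly by global updates; the function names, and the reading of the periodic pattern as one in every 2^p labels, are assumptions made for illustration.

```python
from math import ceil, log2

def label_bits(n_subcomputers):
    """Bits needed to give every sub-computer a unique binary label."""
    return max(1, ceil(log2(n_subcomputers)))

def all_cus():
    return lambda label: True

def given_cu(target):
    return lambda label: label == target

def every_2p(p):
    # Enabled when the lowest p label bits are all zero.
    return lambda label: (label & ((1 << p) - 1)) == 0

def in_interval(lo, hi):
    # Enabled for labels in the half-open range [lo, hi).
    return lambda label: lo <= label < hi

def enabled(n_subcomputers, rule):
    """Labels of the CUs left enabled when every sub-computer applies `rule`."""
    return [k for k in range(n_subcomputers) if rule(k)]

n = 16
print(label_bits(n))               # label size grows logarithmically with n
print(enabled(n, every_2p(2)))     # one CU in every four
print(enabled(n, in_interval(5, 9)))
print(enabled(n, given_cu(3)))
```

Each of these rules needs only a number of bit comparisons proportional to the label length, in line with the logarithmic time quoted above, whereas an arbitrary subset would in general need a description as long as the array itself.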

In all the procedures described above the cells are only sensitive to their immediate 
neighbors. In isolation a cell would have a certain energy gap between its two states; in 
the array environment this is split into distinct levels corresponding to the values of the 
field variable. However in a real system the cells would also be influenced by the states of 
non-neighboring cells, with the result that each level would be split into a multiplet of many 
levels. In order to drive a transition in reasonable time it would be desirable to address a 
multiplet collectively [17]; this could only be achieved if the multiplets are non-overlapping.
This condition translates to a constraint on how short-range the physical cell-cell interaction
must be. It is easy to show that any one-dimensional system with a symmetric interaction
(right and left neighbors indistinguishable) has multiplets that are well separated if the
interaction is r^-3 or shorter. Dipole elements such as nuclear or electron spins constitute one
class of examples. Note that when the multiplets are well separated, it is possible to tolerate 
a degree of physical variation between cells which are nominally of the same type, since 
the effect of such variation is merely to broaden the multiplet correspondingly. Similarly, 
modest variations in the inter-cell spacing and coupling strength could be tolerated. Note 
also that if the interaction is not diagonal in the basis of the cell's states, the scheme still 
functions provided that the difference between the fundamental frequencies of the A and the 
B cells is large compared to the magnitude of any off-diagonal terms.
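
The multiplet-separation condition can be checked numerically in a simplified model, offered here as a rough sketch rather than the derivation intended in the text. It assumes an Ising-like coupling of strength J/d^n between cells d sites apart, equal spacing, and a worst case in which all non-neighboring cells are aligned. The nearest-neighbor levels are then 2J apart, and each is broadened by at most 2J times the sum of d^-n over d >= 2 on either side, so the multiplets stay apart if that sum is below 1/2.

```python
def tail_sum(power, cutoff=100_000):
    """Sum of d**(-power) for d = 2..cutoff, a truncated estimate of the
    contribution from one side's non-neighbouring cells (in units of J)."""
    return sum(d ** -power for d in range(2, cutoff + 1))

def multiplets_separated(power):
    """Worst-case test in units of J: levels are 2 apart, and each multiplet
    extends at most 2 * tail_sum to either side of its centre."""
    half_width = 2 * tail_sum(power)
    return half_width < 1.0

for n in (2, 3, 4):
    print(n, round(tail_sum(n), 4), multiplets_separated(n))
```

Within this model an r^-3 interaction leaves a clear gap between multiplets while an r^-2 interaction does not, which is consistent with the range quoted above.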

The present scheme is suited to systems where it is experimentally difficult or impossible 
to target specific units for manipulation. One example involves the nuclear magnetic reso-
nance (NMR) approach, which has been successfully used to realize 3-qubit computers.
Here the computer is a molecule possessing a number of spin-non-zero nuclei, the states of
which are used to represent the qubits. Probably the most fundamental obstacle preventing
experimentalists from extending the number of qubits, N, is the difficulty of distinguishing
N unique sets of energy levels. This obstacle is removed by our model, which could be 
realized by a linear molecule with A and B sites alternating along its length: we need not 
distinguish between any two sites of a given type, hence we have only two fixed sets of energy 
levels regardless of N. For a second example, consider the solid state realization recently 
proposed by Kane. Here qubits are again realized by the states of nuclear spins, but these
belong to donor impurity atoms embedded in Si. In order to gain control over specific qubits, 
Kane introduces a set of electrostatic gates located near the donors, with two gates being 
required for each donor. These electrodes represent both a principal source of decoherence 
in the system, and a major difficulty for experimental realization. By switching to the model 
presented here, where there is no need to address qubits individually, the essential role of
the electrodes is removed. If, as seems entirely plausible, their remaining functions can
be obviated by design modifications, then it will be possible to dispense with them entirely. 

To conclude, we have exhibited a model of quantum computation which requires only 
global manipulations and yet has very modest physical requirements. We have shown that it 
is possible to efficiently perform non-trivial parallel operations on such a model. The model 
operates with interaction ranges as great as r^-3, and is thus applicable to a wide range of QC
implementations where it may significantly reduce the obstacles to experimental realization. 

The author would like to thank Seth Lloyd, Mike Mosca and Wim Van Dam for useful 
discussions. This work was supported by an EPSRC fellowship. 
Fig. 1 A section of the array containing the control unit (CU) and three qubits, X, Y and Z,
each encoded over four cells. All other cells are in state '↓'. White cells are of type A, shaded
cells of type B. (a) The effect of the update B_0: the CU moves one cell to the left, all the
qubits move one cell to the right, yet the form of the qubits and the CU is preserved. (b)
and (c) A general 'one-qubit' gate. For clarity the states '↓' are written as '-'. In response
to the updates A_0, B_0, A_0, B_0, ... the CU passes through qubit X, leaving it unchanged, and
continues until mid-way through passing qubit Y. Then additional updates are applied: the
effect of the last is to apply a unitary transform U only to the cell representing the Y qubit,
yielding qubit T = U.Y. As indicated in (c), re-applying the updates in reverse order then
moves the CU away from T.

Fig. 2 Schematic of the 'two-qubit' control-U process. The target qubit is S, and the control 
is Y. The CU moves transparently past the Z qubit, and continues until mid-way through 
passing Y. To this point the process is identical to Fig. 1(c); however, now the CU itself
is subject to a transformation: it is altered from ↑↑↓↓↑↑ to ↑↑↑↑↑↑ if, and only if, Y = 0.
Both forms of the CU will pass transparently through the intervening qubits W, X in answer
to the same update sequence. When qubit S is reached a new sequence is applied, the last
of which subjects S to a transform U if, and only if, the CU arrived in its unaltered form.
Finally we re-apply all the updates preceding the last in reverse order so as to return the
CU to its initial state. An explicit depiction of the process is available from Ref. [13].



Fig. 3 (a) Generalization from the simple serial model (i) to the parallel model (ii) employing 
'sub-computers'. (b) One means of disabling the CU simply by delaying it. The CU is
delayed (or not delayed) depending only on the states of four auxiliary bits, which have
been set by the preceding 'sub-computer' computation. The sequence of updates applied
is the same for both cases. The delayed CU is in an 'empty' region of the array when the
non-delayed CU has reached its target qubit, here denoted Q. An explicit depiction of the
process is available from Ref. [13].
NOT INTENDED FOR JOURNAL PUBLICATION

Web: FigureA.pdf. The explicit description of the control-U process which is shown 
schematically in Figure 2. Note that the update sequence is the same in both parts of the 
Figure, but only in the Y = 1 case does the last update have an effect. After this
update has been applied, the preceding updates would be re-applied in reverse order to 
complete the process. 

The two blue rows show the control unit having moved from the X qubit to the W 
without changing its form. It follows that the CU could cross any number of intervening 
qubits to reach its target. 

Note: in this Figure we use only 4 spacer cells between each qubit; this is the minimum 
that permits qubit gate operations. A consequence of this tight packing is that one of the 
neighbors of the target qubit must be disturbed during the operation (here W is disturbed). 
However, the final update affects only the target qubit and hence the disturbance of
the neighbor is undone when the preceding updates are re-applied in reverse order. 

Web: FigureB.pdf. An explicit depiction of the delaying transformation which is shown 
schematically in Figure 3(b). Note that the transform is applied/not applied to the control 
unit depending only on the values stored in the four auxiliary bits, i.e. the update sequence 
is the same for both cases. The auxiliary bits will have been set by the preceding 'sub-
computer' calculation.

Web: FigureC.pdf. The most compact implementation of the important Control-Control-U
gate. The transformation U is applied to the target qubit W if, and only if, both the
control qubits X and Y are in state 1. 